A Fast Algorithm for Detecting Gene-gene Interactions in Genome-wide Association Studies.

نویسندگان

  • Jiahan Li
  • Wei Zhong
  • Runze Li
  • Rongling Wu
چکیده

With the recent advent of high-throughput genotyping techniques, genetic data for genome-wide association studies (GWAS) have become increasingly available, which entails the development of efficient and effective statistical approaches. Although many such approaches have been developed and used to identify single-nucleotide polymorphisms (SNPs) that are associated with complex traits or diseases, few are able to detect gene-gene interactions among different SNPs. Genetic interactions, also known as epistasis, have been recognized to play a pivotal role in contributing to the genetic variation of phenotypic traits. However, because of an extremely large number of SNP-SNP combinations in GWAS, the model dimensionality can quickly become so overwhelming that no prevailing variable selection methods are capable of handling this problem. In this paper, we present a statistical framework for characterizing main genetic effects and epistatic interactions in a GWAS study. Specifically, we first propose a two-stage sure independence screening (TS-SIS) procedure and generate a pool of candidate SNPs and interactions, which serve as predictors to explain and predict the phenotypes of a complex trait. We also propose a rates adjusted thresholding estimation (RATE) approach to determine the size of the reduced model selected by an independence screening. Regularization regression methods, such as LASSO or SCAD, are then applied to further identify important genetic effects. Simulation studies show that the TS-SIS procedure is computationally efficient and has an outstanding finite sample performance in selecting potential SNPs as well as gene-gene interactions. We apply the proposed framework to analyze an ultrahigh-dimensional GWAS data set from the Framingham Heart Study, and select 23 active SNPs and 24 active epistatic interactions for the body mass index variation. It shows the capability of our procedure to resolve the complexity of genetic control.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Genome-wide Association Study to Identify Genes and Biological Pathways Associated with Type Traits in Cattle using Pathway Analysis

Extended Abstract Introduction and Objective: Type traits describing the skeletal characteristics of an animal are moderately to strongly genetically correlate with other economically important traits in cattle including fertility, longevity and carcass traits. The present study aimed to conduct a genome wide association studies (GWAS) based on gene-set enrichment analysis for identifying the ...

متن کامل

Unveiling the genetic loci for a panicle developmental trait using genome-wide association study in rice

Panicle size has a high correlation with grain yield in rice. There is a bottleneck to identify the additional quantitative trait loci (QTL) for panicle size due to the conventional traits used for QTL mapping. To identify more genetic loci for panicle size, a panicle developmental trait (LNTB, the length from panicle neck-knot to the first primary branch in the rachis) related to panicle size ...

متن کامل

Detecting genome-wide epistases based on the clustering of relatively frequent items

MOTIVATION In genome-wide association studies (GWAS), up to millions of single nucleotide polymorphisms (SNPs) are genotyped for thousands of individuals. However, conventional single locus-based approaches are usually unable to detect gene-gene interactions underlying complex diseases. Due to the huge search space for complicated high order interactions, many existing multi-locus approaches ar...

متن کامل

P30: Are There Anxious Genes?

Anxiety comprises many clinical descriptions and phenotypes. A genetic predisposition to anxiety is undoubted; however, the nature and extent of that contribution is still unclear. Extensive genetic studies of the serotonin (5-hydroxytryptamine, 5-HT) transporter (5-HTT) gene have revealed how variation in gene expression can be correlated with anxiety phenotypes. Complete genome-wide linkage s...

متن کامل

P-41: Association Study of MICA*008 Gene Polymorphism with Chlamydia Trachomatis Infection in Infertile Men Reffer to Royan Institute

Background: Chlamydia trachomatis(CT) is an obligate intracellular bacteria, requires living cells to replicate itself. CT infection can remain up to 4 years in the couple and affect their fertility. The relationship between CT and infertility is very important because most patients are asymptomatic and untreated. After infection with CT, NK activation signals begin through interactions of its ...

متن کامل

Identification of gene-gene interaction using principal components

After more than 200 genome-wide association studies, there have been some successful identifications of a single novel locus. Thus, the identification of single-nucleotide polymorphisms (SNP) with interaction effects is of interest. Using the Genetic Analysis Workshop 16 data from the North American Rheumatoid Arthritis Consortium, we propose an approach to screen for SNP-SNP interaction using ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • The annals of applied statistics

دوره 8 4  شماره 

صفحات  -

تاریخ انتشار 2014